{
 "cells": [
  {
   "cell_type": "markdown",
   "id": "5461a818",
   "metadata": {},
   "source": [
    "[![Dataflowr](https://raw.githubusercontent.com/dataflowr/website/master/_assets/dataflowr_logo.png)](https://dataflowr.github.io/website/)\n",
    "\n",
    "Make sure you are familiar with the material of [Module 2a - Pytorch tensors](https://dataflowr.github.io/website/modules/2a-pytorch-tensors/) and [Module 2b - Automatic differentiation](https://dataflowr.github.io/website/modules/2b-automatic-differentiation/) before starting this homework."
   ]
  },
  {
   "cell_type": "markdown",
   "id": "2b8ed2a1",
   "metadata": {},
   "source": [
    "# Homework 1: MLP from scratch\n",
    "\n",
    "In this homework, you will code a [Multilayer perceptron](https://en.wikipedia.org/wiki/Multilayer_perceptron) with one hidden layer to classify cloud of points in 2D.\n",
    "\n",
    "Make sure you fill in any place that says `YOUR CODE HERE`.\n",
    "\n",
    "Advice:\n",
    "- As much as possible, please try to make matrix and vector operations (good practice for efficient code)\n",
    "- If you're not familiar with numpy, check the documentation of `np.max`, `np.clip`, `np.random.randn`, `np.reshape`. FYI the matrix multiplication operator is `@`, and you may want to learn about [broadcasting rules](https://numpy.org/doc/stable/user/basics.broadcasting.html) to see how it deals with tensor operations of different sizes\n",
    "- You can also check about `torch.clamp`, `torch.nn.Parameter`\n",
    "\n",
    "## 1. Some utilities and your dataset\n",
    "\n",
    "You should not modify the code in this section"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "id": "a72dc18c",
   "metadata": {},
   "outputs": [],
   "source": [
    "# all of these libraries are used for plotting\n",
    "import numpy as np\n",
    "import matplotlib.pyplot as plt\n",
    "\n",
    "# Plot the dataset\n",
    "def plot_data(ax, X, Y):\n",
    "    plt.axis('off')\n",
    "    ax.scatter(X[:, 0], X[:, 1], s=1, c=Y, cmap='bone')\n",
    "\n",
    "from sklearn.datasets import make_moons\n",
    "X, Y = make_moons(n_samples=2000, noise=0.1)\n",
    "\n",
    "%matplotlib inline\n",
    "x_min, x_max = -1.5, 2.5\n",
    "y_min, y_max = -1, 1.5\n",
    "fig, ax = plt.subplots(1, 1, facecolor='#4B6EA9')\n",
    "ax.set_xlim(x_min, x_max)\n",
    "ax.set_ylim(y_min, y_max)\n",
    "plot_data(ax, X, Y)\n",
    "plt.show()"
   ]
  },
  {
   "cell_type": "markdown",
   "id": "8685399b",
   "metadata": {},
   "source": [
    "This is your dataset: two moons each one corresponding to one class (black or white in the picture above).\n",
    "\n",
    "In order to make it more fun and illustrative, the code below allows you to see the decision boundary of your classifier. Unfortunately, animation is not working on colab..."
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "id": "a897bcfd",
   "metadata": {},
   "outputs": [],
   "source": [
    "# Define the grid on which we will evaluate our classifier\n",
    "xx, yy = np.meshgrid(np.arange(x_min, x_max, .1),\n",
    "                     np.arange(y_min, y_max, .1))\n",
    "\n",
    "to_forward = np.array(list(zip(xx.ravel(), yy.ravel())))\n",
    "\n",
    "# plot the decision boundary of our classifier\n",
    "def plot_decision_boundary(ax, X, Y, classifier):\n",
    "    # forward pass on the grid, then convert to numpy for plotting\n",
    "    Z = classifier.forward(to_forward)\n",
    "    Z = Z.reshape(xx.shape)\n",
    "    \n",
    "    # plot contour lines of the values of our classifier on the grid\n",
    "    ax.contourf(xx, yy, Z>0.5, cmap='Blues')\n",
    "    \n",
    "    # then plot the dataset\n",
    "    plot_data(ax, X,Y)"
   ]
  },
  {
   "cell_type": "markdown",
   "id": "94c47efa",
   "metadata": {},
   "source": [
    "## 2. MLP in numpy\n",
    "\n",
    "Here you need to code your implementation of the [ReLU](https://en.wikipedia.org/wiki/Rectifier_(neural_networks) activation and the [Sigmoid](https://en.wikipedia.org/wiki/Sigmoid_function)."
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "id": "d85e14ea",
   "metadata": {},
   "outputs": [],
   "source": [
    "class MyReLU(object):\n",
    "    def forward(self, x):\n",
    "        # the relu is y_i = max(0, x_i)\n",
    "        # YOUR CODE HERE\n",
    "        raise NotImplementedError()\n",
    "        \n",
    "    \n",
    "    def backward(self, grad_output):\n",
    "        # the gradient is 1 for the inputs that were above 0, 0 elsewhere\n",
    "        # YOUR CODE HERE\n",
    "        raise NotImplementedError()\n",
    "    \n",
    "    def step(self, learning_rate):\n",
    "        # no need to do anything here, since ReLU has no parameters\n",
    "        # YOUR CODE HERE\n",
    "        raise NotImplementedError()\n",
    "\n",
    "class MySigmoid(object):\n",
    "    def forward(self, x):\n",
    "        # the sigmoid is y_i = 1./(1+exp(-x_i))\n",
    "        # YOUR CODE HERE\n",
    "        raise NotImplementedError()\n",
    "    \n",
    "    def backward(self, grad_output):\n",
    "        # the partial derivative is e^-x / (e^-x + 1)^2\n",
    "        # YOUR CODE HERE\n",
    "        raise NotImplementedError()\n",
    "    \n",
    "    def step(self, learning_rate):\n",
    "        # no need to do anything here since Sigmoid has no parameters\n",
    "        # YOUR CODE HERE\n",
    "        raise NotImplementedError()"
   ]
  },
  {
   "cell_type": "markdown",
   "id": "c790c022",
   "metadata": {},
   "source": [
    "Probably a good time to test your functions..."
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "id": "9ab05951",
   "metadata": {},
   "outputs": [],
   "source": [
    "test_relu = MyReLU()\n",
    "test_relu.forward(X[10])"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "id": "ca26d423",
   "metadata": {},
   "outputs": [],
   "source": [
    "test_relu.backward(np.ones(1))"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "id": "988432fb",
   "metadata": {},
   "outputs": [],
   "source": [
    "test_sig = MySigmoid()\n",
    "\n",
    "test_sig.forward(np.ones(1))"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "id": "44234fdc",
   "metadata": {},
   "outputs": [],
   "source": [
    "test_sig.backward(np.ones(1))"
   ]
  },
  {
   "cell_type": "markdown",
   "id": "43b8561c",
   "metadata": {},
   "source": [
    "A bit more complicated, you need now to implement your linear layer i.e. multiplication by a matrix W and summing with a bias b."
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "id": "3457d41b",
   "metadata": {},
   "outputs": [],
   "source": [
    "class MyLinear(object):\n",
    "    def __init__(self, n_input, n_output):\n",
    "        # initialize two random matrices for W and b (use np.random.randn)\n",
    "        # YOUR CODE HERE\n",
    "        raise NotImplementedError()\n",
    "\n",
    "    def forward(self, x):\n",
    "        # save a copy of x, you'll need it for the backward\n",
    "        # return xW + b\n",
    "        # YOUR CODE HERE\n",
    "        raise NotImplementedError()\n",
    "\n",
    "    def backward(self, grad_output):\n",
    "        # y_i = \\sum_j x_j W_{j,i}  + b_i\n",
    "        # d y_i / d W_{j, i} = x_j\n",
    "        # d loss / d y_i = grad_output[i]\n",
    "        # so d loss / d W_{j,i} = x_j * grad_output[i]  (by the chain rule)\n",
    "        # YOUR CODE HERE\n",
    "        raise NotImplementedError()\n",
    "        \n",
    "        # d y_i / d b_i = 1\n",
    "        # d loss / d y_i = grad_output[i]\n",
    "        # YOUR CODE HERE\n",
    "        raise NotImplementedError()\n",
    "        \n",
    "        # now we need to compute the gradient with respect to x to continue the back propagation\n",
    "        # d y_i / d x_j = W_{j, i}\n",
    "        # to compute the gradient of the loss, we have to sum over all possible y_i in the chain rule\n",
    "        # d loss / d x_j = \\sum_i (d loss / d y_i) (d y_i / d x_j)\n",
    "        # YOUR CODE HERE\n",
    "        raise NotImplementedError()\n",
    "    \n",
    "    def step(self, learning_rate):\n",
    "        # update self.W and self.b in the opposite direction of the stored gradients, for learning_rate\n",
    "        # YOUR CODE HERE\n",
    "        raise NotImplementedError()"
   ]
  },
  {
   "cell_type": "markdown",
   "id": "c265a80d",
   "metadata": {},
   "source": [
    "As we did in practicals, you need now to code your network (what we called my_composition in the [practicals](https://github.com/dataflowr/notebooks/blob/master/Module2/02_backprop.ipynb)). Recall with a Sigmoid layer, you should use the BCE loss."
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "id": "4da1ba99",
   "metadata": {},
   "outputs": [],
   "source": [
    "class Sequential(object):\n",
    "    def __init__(self, layers):\n",
    "        # YOUR CODE HERE\n",
    "        raise NotImplementedError()\n",
    "        \n",
    "    def forward(self, x):\n",
    "        # YOUR CODE HERE\n",
    "        raise NotImplementedError()\n",
    "    \n",
    "    def compute_loss(self, out, label):\n",
    "        # use the BCE loss\n",
    "        # -(label * log(output) + (1-label) * log(1-output))\n",
    "        # save the gradient, and return the loss      \n",
    "        # beware of dividing by zero in the gradient.\n",
    "        # split the computation in two cases, one where the label is 0 and another one where the label is 1\n",
    "        # add a small value (1e-10) to the denominator\n",
    "        # YOUR CODE HERE\n",
    "        raise NotImplementedError()\n",
    "\n",
    "    def backward(self):\n",
    "        # apply backprop sequentially, starting from the gradient of the loss\n",
    "        # YOUR CODE HERE\n",
    "        raise NotImplementedError()\n",
    "    \n",
    "    def step(self, learning_rate):\n",
    "        # take a gradient step for each layers\n",
    "        # YOUR CODE HERE\n",
    "        raise NotImplementedError()"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "id": "078008a9",
   "metadata": {},
   "outputs": [],
   "source": [
    "h=50\n",
    "\n",
    "# define your network with your Sequential\n",
    "# it should be a linear layer with 2 inputs and h outputs, followed by a ReLU\n",
    "# then a linear layer with h inputs and 1 outputs, followed by a sigmoid\n",
    "# feel free to try other architectures\n",
    "\n",
    "# YOUR CODE HERE\n",
    "raise NotImplementedError()"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "id": "1eabc519",
   "metadata": {},
   "outputs": [],
   "source": [
    "# unfortunately animation is not working on colab\n",
    "# you should comment the following line if on colab\n",
    "%matplotlib notebook\n",
    "fig, ax = plt.subplots(1, 1, facecolor='#4B6EA9')\n",
    "ax.set_xlim(x_min, x_max)\n",
    "ax.set_ylim(y_min, y_max)\n",
    "losses = []\n",
    "learning_rate = 1e-2\n",
    "for it in range(10000):\n",
    "    # pick a random example id\n",
    "    j = np.random.randint(1, len(X))\n",
    "\n",
    "    # select the corresponding example and label\n",
    "    example = X[j:j+1]\n",
    "    label = Y[j]\n",
    "\n",
    "    # do a forward pass on the example\n",
    "    # YOUR CODE HERE\n",
    "    raise NotImplementedError()\n",
    "\n",
    "    # compute the loss according to your output and the label\n",
    "    # YOUR CODE HERE\n",
    "    raise NotImplementedError()\n",
    "    \n",
    "    # backward pass\n",
    "    # YOUR CODE HERE\n",
    "    raise NotImplementedError()\n",
    "    \n",
    "    # gradient step\n",
    "    # YOUR CODE HERE\n",
    "    raise NotImplementedError()\n",
    "\n",
    "    # draw the current decision boundary every 250 examples seen\n",
    "    if it % 250 == 0 : \n",
    "        plot_decision_boundary(ax, X,Y, net)\n",
    "        fig.canvas.draw()\n",
    "plot_decision_boundary(ax, X,Y, net)\n",
    "fig.canvas.draw()"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "id": "33c3b709",
   "metadata": {},
   "outputs": [],
   "source": [
    "%matplotlib inline\n",
    "plt.plot(losses)"
   ]
  },
  {
   "cell_type": "markdown",
   "id": "9056ec69",
   "metadata": {},
   "source": [
    "## 3. Using a Pytorch module\n",
    "\n",
    "In this last part, use `toch.nn.Module` to recode `MyLinear` and `MyReLU` so that these modules will be pytorch compatible."
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "id": "7cd895de",
   "metadata": {},
   "outputs": [],
   "source": [
    "import torch\n",
    "import torch.nn as nn\n",
    "\n",
    "# y = xw + b\n",
    "class MyLinear_mod(nn.Module):\n",
    "    def __init__(self, n_input, n_output):\n",
    "        super(MyLinear_mod, self).__init__()\n",
    "        # define self.A and self.b the weights and biases\n",
    "        # initialize them with a normal distribution\n",
    "        # use nn.Parameters\n",
    "        # YOUR CODE HERE\n",
    "        raise NotImplementedError()\n",
    "\n",
    "    def forward(self, x):\n",
    "        # YOUR CODE HERE\n",
    "        raise NotImplementedError()\n",
    "        \n",
    "class MyReLU_mod(nn.Module):\n",
    "    def __init__(self):\n",
    "        super(MyReLU_mod, self).__init__()\n",
    "        \n",
    "    def forward(self, x):\n",
    "        # YOUR CODE HERE\n",
    "        raise NotImplementedError()"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "id": "63df73dc",
   "metadata": {},
   "outputs": [],
   "source": [
    "# the grid for plotting the decision boundary should be now made of tensors.\n",
    "to_forward = torch.from_numpy(np.array(list(zip(xx.ravel(), yy.ravel())))).float()"
   ]
  },
  {
   "cell_type": "markdown",
   "id": "e595c06b",
   "metadata": {},
   "source": [
    "Define your network using `MyLinear_mod`, `MyReLU_mod` and [`nn.Sigmoid`](https://pytorch.org/docs/stable/nn.html#sigmoid)"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "id": "4cbd0d0d",
   "metadata": {},
   "outputs": [],
   "source": [
    "h=50\n",
    "\n",
    "# define your network with nn.Sequential\n",
    "# use MyLinear_mod, MyReLU_mod and nn.Sigmoid (from pytorch)\n",
    "# YOUR CODE HERE\n",
    "raise NotImplementedError()"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "id": "95a05b4a",
   "metadata": {},
   "outputs": [],
   "source": [
    "from torch import optim\n",
    "optimizer = optim.SGD(net.parameters(), lr=1e-2)\n",
    "\n",
    "X_torch = torch.from_numpy(X).float()\n",
    "Y_torch = torch.from_numpy(Y).float()\n",
    "\n",
    "# you should comment the following line if on colab\n",
    "%matplotlib notebook\n",
    "fig, ax = plt.subplots(1, 1, facecolor='#4B6EA9')\n",
    "ax.set_xlim(x_min, x_max)\n",
    "ax.set_ylim(y_min, y_max)\n",
    "\n",
    "losses = []\n",
    "criterion = nn.BCELoss()\n",
    "for it in range(10000):\n",
    "    # pick a random example id \n",
    "    j = np.random.randint(1, len(X))\n",
    "\n",
    "    # select the corresponding example and label\n",
    "    example = X_torch[j:j+1]\n",
    "    label = Y_torch[j:j+1].unsqueeze(1)\n",
    "\n",
    "    # do a forward pass on the example\n",
    "    # YOUR CODE HERE\n",
    "    raise NotImplementedError()\n",
    "\n",
    "    # compute the loss according to your output and the label\n",
    "    # YOUR CODE HERE\n",
    "    raise NotImplementedError()\n",
    "\n",
    "    # zero the gradients\n",
    "    # YOUR CODE HERE\n",
    "    raise NotImplementedError()\n",
    "\n",
    "    # backward pass\n",
    "    # YOUR CODE HERE\n",
    "    raise NotImplementedError()\n",
    "\n",
    "    # gradient step\n",
    "    # YOUR CODE HERE\n",
    "    raise NotImplementedError()\n",
    "\n",
    "    # draw the current decision boundary every 250 examples seen\n",
    "    if it % 250 == 0 : \n",
    "        plot_decision_boundary(ax, X,Y, net)\n",
    "        fig.canvas.draw()\n",
    "plot_decision_boundary(ax, X,Y, net)\n",
    "fig.canvas.draw()"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "id": "9de296f5",
   "metadata": {},
   "outputs": [],
   "source": [
    "%matplotlib inline\n",
    "plt.plot(losses)"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "id": "0d445241",
   "metadata": {},
   "outputs": [],
   "source": []
  }
 ],
 "metadata": {
  "kernelspec": {
   "display_name": "dldiy",
   "language": "python",
   "name": "dldiy"
  },
  "language_info": {
   "codemirror_mode": {
    "name": "ipython",
    "version": 3
   },
   "file_extension": ".py",
   "mimetype": "text/x-python",
   "name": "python",
   "nbconvert_exporter": "python",
   "pygments_lexer": "ipython3",
   "version": "3.6.9"
  }
 },
 "nbformat": 4,
 "nbformat_minor": 5
}
